Some macro quantitative features of low-frequency word classes

Authors

  • Fengxiang Fan
  • Yaqin Wang
  • Zhao Gao
Abstract

This contribution examines the macro quantitative features of 15 low-frequency word classes. The relationship between word frequency classes and their sizes obeys Altmann’s power law, and the sizes of low-frequency word classes increase with text length. The relationship between text length and the sizes of low-frequency word classes also obeys Altmann’s law. For texts of the same length, the relationship between vocabulary size and the sizes of hapax legomena and dis legomena is linear, but this relationship does not hold for the other low-frequency word classes. The relationship between the vocabulary/low-frequency word class ratio and text length can be captured with a reparametrized version of Tuldava’s model.
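To make the quantities in the abstract concrete, the following is a minimal sketch of how the sizes of low-frequency word classes (the number of word types occurring exactly m times, for m = 1 to 15) could be counted and checked against a power law of the form size = a * m^b. The function names `frequency_spectrum` and `fit_power_law` are hypothetical, and the log-log least-squares fit is an assumption chosen for simplicity; the abstract does not state which fitting procedure the authors used.

```python
import math
from collections import Counter


def frequency_spectrum(tokens, max_class=15):
    """Return {m: number of word types occurring exactly m times} for m = 1..max_class."""
    type_counts = Counter(tokens)                 # word type -> frequency in the text
    class_sizes = Counter(type_counts.values())   # frequency -> number of types with it
    return {m: class_sizes.get(m, 0) for m in range(1, max_class + 1)}


def fit_power_law(spectrum):
    """Fit size = a * m**b by ordinary least squares on log-log axes.

    Assumption: this simple linearized fit stands in for whatever
    procedure the paper used. Empty classes are skipped because
    log(0) is undefined.
    """
    points = [(math.log(m), math.log(size))
              for m, size in spectrum.items() if size > 0]
    if len(points) < 2:
        raise ValueError("need at least two non-empty frequency classes")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


if __name__ == "__main__":
    # Toy text: "a" and "d" are hapax legomena, "b" is a dis legomenon.
    tokens = "a b b c c c d".split()
    print(frequency_spectrum(tokens, max_class=3))
```

In this sketch, class 1 corresponds to the hapax legomena and class 2 to the dis legomena. Running `frequency_spectrum` on growing prefixes of a text would give the class sizes as a function of text length, which is the second relationship the abstract describes.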

Similar resources

On the Role of Derivational Processes in the Formation of Non-Taxonomic Classes of Lexical Units in Russian

The paper is focused on classes of lexical units which arise as a result of derivational processes – word formation and semantic transfers, acting either in isolation or together, on the basis of common semantic foundations that bind targets and sources of derivation. The lexical items which constitute the classes under study vary in their denotative characteristics and due to their categ...

The Vocabulary Profile of Iranian English Teaching School books

This paper provides a fairly detailed corpus-based vocabulary profile of the Iranian EFL books used in public schools. To this end, the WordPerfect files of all the seven books were converted to text format to get rid of the formatting features and be compatible with the software used for analysis. The software tools used were the Compleat Lexical Tutor suite, version 6.2 (Cobb, 2011), AntConc ...

The Effect of Word Meaning on Speech DysFluency in Adults with Developmental Stuttering

Objectives: Stuttering is one of the most prevalent speech and language disorders. Symptomology of stuttering has been surveyed from different aspects such as biological, developmental, environmental, emotional, learning and linguistic. Previous research on English speakers has suggested that some linguistic features such as word meanings may play a role in the frequency of speech non...

Vocabulary Lists for EAP and Conversation Students

Despite the abundance of research investigating general and academic vocabularies and developing dozens of word lists, few studies have compared academic vocabulary with general service word lists such as conversation vocabulary. Many EAP researchers assume that university students need to know all the words in West’s (1953) General Service List (GSL) as a prerequisite to academic words (e.g., ...

A Supervised Approach to Keyword Extraction from Persian Documents Using Lexical Chains

Keywords are the main focal points of interest within a text and are intended to represent the principal concepts outlined in the document. Determining keywords with traditional methods is a time-consuming process and requires specialized knowledge of the subject. For the purposes of indexing the vast expanse of electronic documents, it is important to automate the keyword extraction task. S...

Journal:
  • Glottometrics

Volume 28

Publication date: 2014